Method and device for displaying or searching for object in image and computer-readable storage medium

ABSTRACT

A method of representing an object appearing in a still or video image, by processing signals corresponding to the image, comprises deriving a plurality of numerical values associated with features appearing on the outline of an object starting from an arbitrary point on the outline and applying a predetermined ordering to said values to arrive at a representation of the outline.

TECHNICAL FIELD

The present invention relates to the representation of an object appearing in a still or video image, such as an image stored in a multimedia database, especially for searching purposes, and to a method and apparatus for searching for an object using such a representation.

BACKGROUND ART

In applications such as image or video libraries, it is desirable to have an efficient representation and storage of the outline or shape of objects or parts of objects appearing in still or video images. A known technique for shape-based indexing and retrieval uses Curvature Scale Space (CSS) representation. Details of the CSS representation can be found in the papers “Robust and Efficient Shape Indexing through Curvature Scale Space” Proc. British Machine Vision conference, pp 53-62, Edinburgh, UK, 1996 and “Indexing an Image Database by Shape Content using Curvature Scale Space” Proc. IEE Colloquium on Intelligent Databases, London 1996, both by F. Mokhtarian, S. Abbasi and J. Kittler, the contents of which are incorporated herein by reference.

The CSS representation uses a curvature function for the outline of the object, starting from an arbitrary point on the outline. The curvature function is studied as the outline shape is evolved by a series of deformations which smooth the shape. More specifically, the zero crossings of the derivative of the curvature function convolved with a family of Gaussian filters are computed. The zero crossings are plotted on a graph, known as the Curvature Scale Space, where the x-axis is the normalised arc-length of the curve and the y-axis is the evolution parameter, specifically, the parameter of the filter applied. The plots on the graph form loops characteristic of the outline. Each convex or concave part of the object outline corresponds to a loop in the CSS image. The co-ordinates of the peaks of the most prominent loops in the CSS image are used as a representation of the outline.

To search for objects in images stored in a database matching the shape of an input object, the CSS representation of an input shape is calculated. The similarity between an input shape and stored shapes is determined by comparing the position and height of the peaks in the respective CSS images using a matching algorithm.

A problem with the known CSS representation is that the peaks for a given outline are based on the curvature function which is computed starting from an arbitrary point on the outline. If the starting point is changed, then there is a cyclic shift along the x-axis of the peaks in the CSS image. Thus, when a similarity measure is computed, all possible shifts need to be investigated, or at least the most likely shift. This results in increased complexity in the searching and matching procedure.

Accordingly the present invention provides a method of representing an object appearing in a still or video image, by processing signals corresponding to the image, the method comprising deriving a plurality of numerical values associated with features appearing on the outline of an object starting from an arbitrary point on the outline and applying a predetermined ordering to said values to arrive at a representation of the outline. Preferably, said values are derived from a CSS representation of said outline, and preferably they correspond to the CSS peak values.

As a result of the invention, the computation involved in matching procedures can be greatly reduced, without a significant reduction in the retrieval accuracy.

DISCLOSURE OF INVENTION

A method of representing an object appearing in a still or video image, by processing signals corresponding to the image described herein, the method comprises deriving a plurality of numerical values associated with features appearing on the outline and applying a predetermined ordering to said values to arrive at a representation of the outline.

In a method described herein, the predetermined ordering is such that the resulting representation is independent of the starting point on the outline.

In a method described herein, the numerical values reflect points of inflection on the outline.

In a method described herein, a curvature scale space representation of the outline is obtained by smoothing the outline in a plurality of stages using a smoothing parameter sigma, resulting in a plurality of outline curves, using values for the maxima and minima of the curvature of each outline curve to derive curves characteristic of the original outline, and selecting the coordinates of peaks of said characteristic curves as said numerical values.

In a method described herein, the coordinates of the characteristic curves correspond to an arc-length parameter of the outline and the smoothing parameter.

In a method described herein, the peak coordinate values are ordered on the basis of the peak height values, corresponding to the smoothing parameter.

In a method described herein, the values are ordered starting from the greatest value.

In a method described herein, the values are ordered in decreasing size.

In a method described herein, the values are ordered starting from the smallest value.

A method of representing an object appearing in a still or video image, by processing signals corresponding to the image described herein, the method comprises deriving a plurality of numerical values associated with features appearing on the outline of an object to represent said outline and deriving a factor indicating the reliability of said representation using a relationship between at least two of said values.

In a method described herein, the factor is based on the ratio between two of said values.

In a method described herein, the ratio is of the two greatest values.

In a method described herein, a curvature scale space representation of the outline is obtained by smoothing the outline in a plurality of stages using a smoothing parameter sigma, resulting in a plurality of outline curves, using values for the maxima and minima of the curvature of each outline curve to derive curves characteristic of the original outline, and selecting the coordinates of peaks of said characteristic curves as said numerical values.

The values are derived using a method as described herein.

A method of searching for an object in a still or video image by processing signals corresponding to images as described herein, the method comprises inputting a query in the form of a two-dimensional outline, deriving a descriptor of said outline using a method as described herein, obtaining a descriptor of objects in stored images derived using a method as described herein and comparing said query descriptor with each descriptor for a stored object, and selecting and displaying at least one result corresponding to an image containing an object for which the comparison indicates a degree of similarity between the query and said object.

A factor is derived for the query outline and for each stored outline using a method as described herein, and the comparison is made using the predetermined ordering only or the predetermined ordering and some other ordering depending on said factors.

A method of representing a plurality of objects appearing in still or video images, by processing signals corresponding to the images described herein, the method comprises deriving a plurality of numerical values associated with features appearing on the outline of each object and applying the same predetermined ordering to said values for each outline to arrive at a representation of each outline.

An apparatus is adapted to implement a method as described herein.

A computer program implements a method as described herein.

A computer system is programmed to operate according to a method as described herein.

A computer-readable storage medium set forth in claim 21 stores computer-executable process steps for implementing a method as described herein.

A method of representing objects in still or video images set forth in claim 22 is described with reference to the accompanying drawings.

A method of searching for objects in still or video images set forth in claim 23 is described with reference to the accompanying drawings.

A computer system set forth in claim 24 is described with reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a video database system;

FIG. 2 is a drawing of an outline of an object;

FIG. 3 is a CSS representation of the outline of FIG. 2; and

FIG. 4 is a block diagram illustrating a searching method.

BEST MODE FOR CARRYING OUT THE INVENTION

First Embodiment

FIG. 1 shows a computerised video database system according to an embodiment of the invention. The system includes a control unit 2 in the form of a computer, a display unit 4 in the form of a monitor, a pointing device 6 in the form of a mouse, an image database 8 including stored still and video images and a descriptor database 10 storing descriptors of objects or parts of objects appearing in images stored in the image database 8.

A descriptor for the shape of each object of interest appearing in an image in the image database is derived by the control unit 2 and stored in the descriptor database 10. The control unit 2 derives the descriptors operating under the control of a suitable program implementing a method as described below.

Firstly, for a given object outline, a CSS representation of the outline is derived. This is done using the known method as described in one of the papers mentioned above.

More specifically, the outline is expressed by a representation Ψ = {(x(u), y(u)), u ∈ [0, 1]} where u is a normalised arc length parameter.

The outline is smoothed by convolving Ψ with a 1D Gaussian kernel g(u, σ), and the curvature zero crossings of the evolving curve are examined as σ changes. The zero crossings are identified using the following expression for the curvature:

$k(u,\sigma) = \frac{X_{u}(u,\sigma)\,Y_{uu}(u,\sigma) - X_{uu}(u,\sigma)\,Y_{u}(u,\sigma)}{\left(X_{u}(u,\sigma)^{2} + Y_{u}(u,\sigma)^{2}\right)^{3/2}}$

where

$X(u,\sigma) = x(u) * g(u,\sigma), \qquad Y(u,\sigma) = y(u) * g(u,\sigma)$

and

$X_{u}(u,\sigma) = x(u) * g_{u}(u,\sigma), \qquad X_{uu}(u,\sigma) = x(u) * g_{uu}(u,\sigma)$

with Y_u(u, σ) and Y_uu(u, σ) obtained in the same way from y(u).

In the above, * represents convolution and subscripts represent derivatives.

The number of curvature zero crossings changes as σ changes, and when σ is sufficiently high Ψ is a convex curve with no zero crossings.
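The curvature expression above can be sketched numerically as follows. This is only an illustrative sketch, not the implementation used in the cited papers: it assumes the outline is given as n equally spaced samples of a closed curve, that σ is measured in sample units, and that the convolution is circular. The function names (`gaussian_kernels`, `circular_convolve`, `smoothed_curvature`, `count_zero_crossings`) are hypothetical.

```python
import numpy as np

def gaussian_kernels(sigma, n):
    """Sampled first and second derivatives of a Gaussian on n samples.
    Assumption: sigma is expressed in sample units."""
    t = np.arange(n) - n // 2              # centred sample offsets
    g = np.exp(-t**2 / (2.0 * sigma**2))
    g /= g.sum()                           # unit-area smoothing kernel
    g_u = -t / sigma**2 * g                # first derivative of g
    g_uu = (t**2 / sigma**4 - 1.0 / sigma**2) * g  # second derivative of g
    return g_u, g_uu

def circular_convolve(signal, kernel):
    """Circular convolution via FFT; the kernel centre is moved to index 0."""
    spectrum = np.fft.fft(signal) * np.fft.fft(np.fft.ifftshift(kernel))
    return np.real(np.fft.ifft(spectrum))

def smoothed_curvature(x, y, sigma):
    """k(u, sigma) for a closed outline sampled as arrays x, y."""
    g_u, g_uu = gaussian_kernels(sigma, len(x))
    xu, yu = circular_convolve(x, g_u), circular_convolve(y, g_u)
    xuu, yuu = circular_convolve(x, g_uu), circular_convolve(y, g_uu)
    return (xu * yuu - xuu * yu) / (xu**2 + yu**2) ** 1.5

def count_zero_crossings(k):
    """Number of sign changes of k around the closed outline."""
    s = np.sign(k)
    return int(np.sum(s != np.roll(s, 1)))

# Example outlines: a unit circle and a five-lobed shape (both hypothetical).
theta = 2.0 * np.pi * np.arange(512) / 512
circle_k = smoothed_curvature(np.cos(theta), np.sin(theta), sigma=4.0)
r = 1.0 + 0.5 * np.cos(5.0 * theta)
lobed_x, lobed_y = r * np.cos(theta), r * np.sin(theta)
crossings_small_sigma = count_zero_crossings(smoothed_curvature(lobed_x, lobed_y, 2.0))
crossings_large_sigma = count_zero_crossings(smoothed_curvature(lobed_x, lobed_y, 60.0))
```

In this sketch the lobed outline has curvature zero crossings at a small σ and none once σ is large enough for the smoothed curve to become convex, which is the behaviour described in the preceding paragraph.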

The zero crossing points (u, σ) are plotted on a graph known as the CSS image space. This results in a plurality of curves characteristic of the original outline. The peaks of the characteristic curves are identified and the corresponding co-ordinates are extracted and stored. In general terms, this gives a set of n co-ordinate pairs [(x1,y1), (x2,y2), . . . (xn,yn)], where n is the number of peaks, and xi is the arc-length position of the ith peak and yi is the peak height.

The order and position of characteristic curves and the corresponding peaks as they appear in the CSS image space depend on the starting point for the curvature function described above. According to the invention, the peak co-ordinates are re-ordered using a specific ordering function.

Ordering is performed by a one-to-one mapping T of the peak indices {1 . . . n} to a new set of indices {1 . . . n}.

In this embodiment, the co-ordinate pairs are ordered by considering the size of the y co-ordinates. Firstly, the highest peak is selected. Suppose the kth peak is the most prominent. Then (xk, yk) becomes the first in the ordered set of values. In other words, T(k)=1. Similarly, the other peak co-ordinates are re-ordered in terms of decreasing peak height. If two peaks have the same height, then the peak having the x-co-ordinate closest to that of the preceding co-ordinate pair is placed first. In other words, each co-ordinate pair having an original index i is assigned a new index j where T(i)=j and yj>=y(j+1). Also, each value xi is subjected to a cyclic shift of −xk.
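The ordering can be restated compactly as follows. The primed symbols and the "mod 1" form of the cyclic shift are notation introduced here for illustration; the modulus follows from x being a normalised arc-length position in [0, 1].

$T(k) = 1, \qquad y_{T^{-1}(1)} \ge y_{T^{-1}(2)} \ge \dots \ge y_{T^{-1}(n)}, \qquad x'_{i} = (x_{i} - x_{k}) \bmod 1$

For instance, with x_k = 0.22, a peak at x_i = 0.124 is shifted to (0.124 − 0.22) mod 1 = 0.904, and the highest peak itself is shifted to 0.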

As a specific example, the outline shown in FIG. 2 results in a CSS image as shown in FIG. 3. Details of the co-ordinates of the peaks of the curves in the CSS image are given in Table 1 below.

TABLE 1

Peak Index    X        Y
1             0.124    123
2             0.68     548
3             0.22     2120
4             0.773    1001
5             0.901    678

The peaks are ordered using the ordering described above. In other words, the co-ordinates are ordered in terms of decreasing peak height. Also, the x co-ordinates are all shifted towards zero by an amount equal to the original x co-ordinate of the highest peak. This results in re-ordered peak co-ordinates as given in Table 2 below.

TABLE 2

Peak Index    X        Y
1             0        2120
2             0.553    1001
3             0.681    678
4             0.46     548
5             0.904    123

These re-ordered peak co-ordinates form the basis of the descriptor stored in the database 10 for the object outline. In this embodiment, the peak co-ordinates are stored in the order shown in Table 2. Alternatively, the co-ordinates can be stored in the original order, together with an associated indexing indicating the new ordering.
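As an illustration, the re-ordering of this embodiment can be sketched in a few lines, using the peak values of Table 1 as input. The function name `order_by_height` is hypothetical, the rounding to six decimals is only there to suppress floating-point noise, and the tie-breaking rule for peaks of equal height is omitted for brevity (equal-height peaks simply keep their original relative order).

```python
def order_by_height(peaks):
    """Re-order (x, y) CSS peaks by decreasing height y, after cyclically
    shifting every x so that the highest peak lies at x = 0."""
    x_highest = max(peaks, key=lambda p: p[1])[0]
    shifted = [(round((x - x_highest) % 1.0, 6), y) for x, y in peaks]
    # sorted() is stable, so peaks of equal height keep their input order
    return sorted(shifted, key=lambda p: -p[1])

# Peak co-ordinates from Table 1
table_1 = [(0.124, 123), (0.68, 548), (0.22, 2120), (0.773, 1001), (0.901, 678)]
descriptor = order_by_height(table_1)
# descriptor reproduces the rows of Table 2
```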

Second Embodiment

An alternative method of representing the object outline according to a second embodiment will now be described.

A CSS representation of the outline is derived as described above. However, the ordering of the peak co-ordinates is different from the ordering in Embodiment 1 described above. More specifically, firstly the highest peak is selected. Suppose peak k is the most prominent one. Then (xk,yk) becomes the first peak in the ordered set of peaks. The subsequent peaks are ordered so that for peak co-ordinates of original index i, then T(i)=j, and xj<=x(j+1). Also, all values xi are shifted downwards by an amount xk equal to the original x co-ordinate of original peak k.

In other words, in the ordering method according to embodiment 2, the highest peak is selected and placed first, and then the remaining peaks follow in the original sequence starting from the highest peak.

Table 3 below shows the peak values of Table 1 ordered according to the second embodiment.

TABLE 3

Peak Index    X        Y
1             0        2120
2             0.46     548
3             0.553    1001
4             0.681    678
5             0.904    123
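The ordering of the second embodiment can be sketched in the same way. Again this is an illustrative sketch with a hypothetical function name (`order_from_highest`); the input is the set of peaks from Table 1, and the rounding only suppresses floating-point noise.

```python
def order_from_highest(peaks):
    """Place the highest peak first at x = 0, then list the remaining
    peaks by increasing cyclically shifted x co-ordinate."""
    x_highest = max(peaks, key=lambda p: p[1])[0]
    shifted = [(round((x - x_highest) % 1.0, 6), y) for x, y in peaks]
    # the highest peak has shifted x = 0, so it sorts to the front
    return sorted(shifted, key=lambda p: p[0])

# Peak co-ordinates from Table 1
table_1 = [(0.124, 123), (0.68, 548), (0.22, 2120), (0.773, 1001), (0.901, 678)]
descriptor_2 = order_from_highest(table_1)
# descriptor_2 reproduces the rows of Table 3
```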

In a development of embodiments 1 and 2 described above, a confidence factor (CF) is additionally associated with each representation of a shape. The CF is calculated from the ratio of the second highest and the highest peak values for a given shape.

For the outline shown in FIG. 2, the CF value is CF=1001/2120. In this example, the CF is quantized by rounding to the nearest 0.1 to reduce storage requirements. Accordingly, here CF=0.5.

The CF value in this example is a reflection of the accuracy or uniqueness of the representation. Here, a CF value close to one means low confidence and a CF value close to zero means high confidence. In other words, the closer the two highest peak values are, the less likely it is that the representation is accurate.
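The CF calculation for the example outline can be sketched as below. The function name `confidence_factor` is hypothetical, and the value returned for an outline with fewer than two peaks is an assumption, since the text does not cover that case.

```python
def confidence_factor(peak_heights, step=0.1):
    """Ratio of the second highest to the highest peak height,
    quantized to the nearest multiple of step."""
    top_two = sorted(peak_heights, reverse=True)[:2]
    if len(top_two) < 2:
        # Assumption: a single peak cannot be confused with another one,
        # so it is treated as the highest-confidence case.
        return 0.0
    ratio = top_two[1] / top_two[0]
    return round(round(ratio / step) * step, 10)

# Peak heights of the outline of FIG. 2 (Table 1)
cf = confidence_factor([123, 548, 2120, 1001, 678])
# 1001 / 2120 is about 0.472, which quantizes to 0.5
```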

The CF value can be useful when performing a matching procedure, as will be shown in the following description.

Third Embodiment

A method of searching for an object in an image in accordance with an embodiment of the invention will now be described with reference to FIG. 4 which is a block diagram of the searching method.

Here, the descriptor database 10 of the system of FIG. 1 stores descriptors derived according to the first ordering method described above together with associated CF values.

The user initiates a search by drawing an object outline on the display using the pointing device (step 410). The control unit 2 then derives a CSS representation of the input outline and orders the peak co-ordinates in accordance with the same ordering function used for the images in the database to arrive at a descriptor for the input outline (step 420). The control unit 2 then also calculates a CF value for the input outline by calculating the ratio of the second highest peak value to the highest peak value and quantizing the result (step 430).

The control unit 2 then compares the CF value for the input outline with a predetermined threshold (step 440). In this example, the threshold is 0.75. If the CF value is lower than the threshold, indicating a relatively high confidence in the accuracy of the input descriptor, then the next step is to consider the CF value for the model (i.e. image stored in the database) under consideration. If the model CF is also lower than the threshold (step 450), then the input and model are compared using the respective descriptors in the predetermined ordering only (step 460). If CF for either the input or the model is greater than the threshold, then matching is performed by comparing all possible different orderings of the co-ordinate values in the input descriptors with the model descriptor in the database (step 470).
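The decision made in steps 440 to 470 can be sketched as a small function. The function name and the returned labels are hypothetical. The text does not say what happens when a CF value is exactly equal to the threshold; this sketch treats that case as low confidence, which is an assumption.

```python
def choose_matching_mode(query_cf, model_cf, threshold=0.75):
    """Decide how a query descriptor is compared with one model descriptor."""
    if query_cf < threshold and model_cf < threshold:
        # Both descriptors are considered reliable (steps 440, 450, 460)
        return "stored_ordering_only"
    # At least one descriptor is unreliable (step 470).
    # Assumption: a CF equal to the threshold also ends up here.
    return "alternative_orderings"

mode_confident = choose_matching_mode(0.5, 0.3)
mode_uncertain = choose_matching_mode(0.5, 0.9)
```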

The matching comparison is carried out using a suitable algorithm resulting in a similarity measure for each descriptor in the database. A known matching algorithm such as described in the above-mentioned papers can be used. That matching procedure is briefly described below.

Given two closed contour shapes, the image curve Ψi and the model curve Ψm and their respective sets of peaks {(xi1,yi1),(xi2,yi2), . . . ,(xin,yin)} and {(xm1,ym1), (xm2,ym2), . . . ,(xmn,ymn)} the similarity measure is calculated. The similarity measure is defined as a total cost of matching of peaks in the model into peaks in the image. The matching which minimises the total cost is determined using dynamic programming. The algorithm recursively matches the peaks from the model to the peaks from the image and calculates the cost of each such match. Each model peak can be matched with only one image peak and each image peak can be matched with only one model peak. Some of the model and/or image peaks may remain unmatched, and there is an additional penalty cost for each unmatched peak. Two peaks can be matched if their horizontal distance is less than 0.2. The cost of a match is the length of the straight line between the two matched peaks. The cost of an unmatched peak is its height.

In more detail the algorithm works by creating and expanding a tree-like structure, where nodes correspond to matched peaks:

1. Create starting node consisting of the largest maximum of the image (xik, yik) and the largest maximum of the model (xmr, ymr).

2. For each remaining model peak which is within 80 percent of the largest maximum of the image peaks create an additional starting node.

3. Initialise the cost of each starting node created in 1 and 2 to the absolute difference of the y-coordinate of the image and model peaks linked by this node.

4. For each starting node in 3, compute the CSS shift parameter alpha, defined as the difference in the x (horizontal) coordinates of the model and image peaks matched in this starting node. The shift parameter will be different for each node.

5. For each starting node, create a list of model peaks and a list of image peaks. The lists hold information on which peaks are yet to be matched. For each starting node mark peaks matched in this node as “matched”, and all other peaks as “unmatched”.

6. Recursively expand a lowest cost node (starting from each node created in steps 1-6 and following with its children nodes) until the condition in point 8 is fulfilled. To expand a node use the following procedure:

7. Expanding a node:

If there is at least one image and one model peak left unmatched:

Select the largest scale image curve CSS maximum which is not matched (xip,yip). Apply the starting node shift parameter (computed in step 4) to map the selected maximum to the model CSS image; the selected peak now has coordinates (xip-alpha, yip). Locate the nearest model curve peak which is unmatched (xms,yms). If the horizontal distance between the two peaks is less than 0.2 (i.e. |xip-alpha-xms|<0.2), match the two peaks and define the cost of the match as the length of the straight line between the two peaks. Add the cost of the match to the total cost of that node. Remove the matched peaks from the respective lists by marking them as “matched”. If the horizontal distance between the two peaks is greater than 0.2, the image peak (xip,yip) cannot be matched. In that case add its height yip to the total cost and remove only the peak (xip,yip) from the image peak list by marking it as “matched”.

Otherwise (there are only image peaks or there are only model peaks left unmatched):

Define the cost of the match as the height of the highest unmatched image or model peak and remove that peak from the list.

8. If after expanding a node in 7 there are no unmatched peaks in both the image and model lists, the matching procedure is terminated. The cost of this node is the similarity measure between the image and model curve. Otherwise, go to point 7 and expand the lowest cost node.

The above procedure is repeated with the image curve peaks and the model curve peaks swapped. The final matching value is the lower of the two.
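Steps 1 to 8 can be sketched as a best-first expansion over a priority queue. This is one possible reading of the procedure, not the implementation from the cited papers, and it rests on several assumptions: "within 80 percent" is read as a model peak height of at least 0.8 times the highest image peak; alpha is taken as image x minus model x so that it agrees with the mapping xip-alpha in step 7; horizontal distance is measured cyclically on the unit interval; and a distance of exactly 0.2 is treated as "cannot be matched". All function names are hypothetical.

```python
import heapq
import math

def cyclic_dx(a, b):
    """Horizontal distance between two x positions, assumed cyclic on [0, 1)."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)

def one_way_cost(image, model, max_dx=0.2, start_ratio=0.8):
    """Cost of matching image peaks to model peaks; peaks are (x, y) pairs."""
    if not image or not model:
        return float(sum(y for _, y in image) + sum(y for _, y in model))
    ik = max(range(len(image)), key=lambda i: image[i][1])
    mr = max(range(len(model)), key=lambda j: model[j][1])
    xi, yi = image[ik]
    # Steps 1-2: starting nodes pair the highest image peak with model peaks
    starts = [mr] + [j for j in range(len(model))
                     if j != mr and model[j][1] >= start_ratio * yi]
    heap = []
    for tie, j in enumerate(starts):
        xm, ym = model[j]
        free_i = frozenset(range(len(image))) - {ik}   # step 5: unmatched lists
        free_m = frozenset(range(len(model))) - {j}
        # Steps 3-4: initial cost and shift parameter alpha (assumed sign)
        heapq.heappush(heap, (abs(yi - ym), tie, xi - xm, free_i, free_m))
    tie = len(starts)
    while True:
        # Step 6: always expand the node with the lowest cost so far
        cost, _, alpha, free_i, free_m = heapq.heappop(heap)
        if free_i and free_m:                          # step 7, first case
            p = max(free_i, key=lambda i: image[i][1])
            xp, yp = image[p]
            s = min(free_m, key=lambda j: cyclic_dx(xp - alpha, model[j][0]))
            dx = cyclic_dx(xp - alpha, model[s][0])
            if dx < max_dx:
                cost += math.hypot(dx, yp - model[s][1])
                free_m = free_m - {s}
            else:
                cost += yp                             # image peak unmatched
            free_i = free_i - {p}
        elif free_i:                                   # only image peaks left
            q = max(free_i, key=lambda i: image[i][1])
            cost += image[q][1]
            free_i = free_i - {q}
        elif free_m:                                   # only model peaks left
            q = max(free_m, key=lambda j: model[j][1])
            cost += model[q][1]
            free_m = free_m - {q}
        if not free_i and not free_m:                  # step 8: finished
            return float(cost)
        heapq.heappush(heap, (cost, tie, alpha, free_i, free_m))
        tie += 1

def css_match_cost(peaks_a, peaks_b):
    """Run the procedure in both directions and keep the lower cost."""
    return min(one_way_cost(peaks_a, peaks_b), one_way_cost(peaks_b, peaks_a))

# Peaks of Table 2, a cyclically shifted copy, and a copy with one extra peak
table_2 = [(0.0, 2120), (0.553, 1001), (0.681, 678), (0.46, 548), (0.904, 123)]
shifted_copy = [((x + 0.3) % 1.0, y) for x, y in table_2]
with_extra_peak = table_2 + [(0.3, 50)]
```

Under these assumptions a cyclically shifted copy of an outline matches at essentially zero cost, because the shift is absorbed by alpha, and an extra unmatched peak adds its height to the cost.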

As another example, for each position in the ordering, the distance between the input x value and the corresponding model x value and the distance between the input y value and the corresponding model y value are calculated. The total distance over all the positions is calculated and the smaller the total distance, the closer the match. If the number of peaks for the input and the model are different, the peak height for the leftovers is included in the total distance.
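This simpler position-wise comparison can be sketched as follows. The text does not specify how the x and y distances are combined; the sketch sums their absolute values, which is an assumption, and the function name `positional_distance` is hypothetical.

```python
def positional_distance(query, model):
    """Total distance between two ordered descriptors of (x, y) peaks.
    Assumption: per-position distance is |dx| + |dy|."""
    total = 0.0
    for (xq, yq), (xm, ym) in zip(query, model):
        total += abs(xq - xm) + abs(yq - ym)
    # Leftover peaks of the longer descriptor contribute their height
    shared = min(len(query), len(model))
    longer = query if len(query) > len(model) else model
    total += sum(y for _, y in longer[shared:])
    return total

# Ordered descriptor from Table 2 and a shorter, slightly different model
query_descriptor = [(0.0, 2120), (0.553, 1001), (0.681, 678), (0.46, 548), (0.904, 123)]
model_descriptor = [(0.0, 2100), (0.553, 1001), (0.681, 678), (0.46, 548)]
distance = positional_distance(query_descriptor, model_descriptor)
# 20 for the first position plus 123 for the leftover peak
```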

The above steps are repeated for each model in the database (step 480).

The similarity measures resulting from the matching comparisons are ordered (step 490) and the objects corresponding to the descriptors having similarity measures indicating the closest match (i.e. here the lowest similarity measures) are then displayed on the display unit 4 for the user (step 500). The number of objects to be displayed can be pre-set or selected by the user.
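Steps 490 and 500 amount to keeping the entries with the lowest similarity measures. A minimal sketch, assuming the measures are held in a dictionary keyed by a hypothetical object identifier:

```python
import heapq

def best_matches(similarity_by_object, count=3):
    """Return the count (object, measure) pairs with the lowest measures,
    closest match first."""
    return heapq.nsmallest(count, similarity_by_object.items(),
                           key=lambda item: item[1])

# Hypothetical similarity measures for four stored objects
measures = {"object_a": 12.5, "object_b": 3.0, "object_c": 7.25, "object_d": 40.0}
shown = best_matches(measures, count=2)
```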

In the above embodiment, if the CF value is greater than the threshold, then all possible orderings of the input descriptor values are considered in the matching. It is not necessary to consider all possible orderings, and instead only some possible orderings may be considered, such as some or all cyclic shifts of the original CSS representation. Furthermore, in the above embodiment, the threshold value is set to 0.75, but the threshold can be set to different levels. For example, if the threshold is set to zero, then all matches are performed by analysis of some or all possible orderings. This increases the amount of computation required compared with the case when the threshold is above zero, but since the peaks have already been ordered and their x-coordinates adjusted for a particular starting point or object rotation, the amount of computation required is reduced compared with the original system where no such adjustment has been made. Consequently, by setting the threshold to zero the system offers some reduction in computational cost and the retrieval performance is exactly the same as in the original system.
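One way of enumerating cyclic shifts of a descriptor is to take each peak in turn as the origin and re-order by height. This is only one possible interpretation of "some or all cyclic shifts"; the choice of candidate shifts and the function name are assumptions.

```python
def cyclic_shift_candidates(peaks):
    """One candidate descriptor per peak, obtained by shifting that peak
    to x = 0 and ordering the result by decreasing height."""
    candidates = []
    for x_origin, _ in peaks:
        shifted = [(round((x - x_origin) % 1.0, 6), y) for x, y in peaks]
        candidates.append(sorted(shifted, key=lambda p: -p[1]))
    return candidates

# Ordered descriptor from Table 2
table_2 = [(0.0, 2120), (0.553, 1001), (0.681, 678), (0.46, 548), (0.904, 123)]
candidates = cyclic_shift_candidates(table_2)
```

Each candidate can then be compared with the model descriptor, and the lowest resulting measure kept.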

Alternatively, if the threshold is set to one, then matching is performed using only the stored ordering. There is then a significant reduction in computation required, with only a small deterioration in retrieval accuracy.

Various modifications of the embodiments described above are possible. For example, instead of ordering the CSS peak co-ordinate values as described in embodiments 1 and 2 other orderings can be used. For example, the values can be placed in order of increasing rather than decreasing peak height. Instead of storing the ordered values in the database, the ordering can be carried out during the matching procedure.

INDUSTRIAL APPLICABILITY

A system according to the invention may, for example, be provided in an image library. Alternatively, the databases may be sited remote from the control unit of the system, connected to the control unit by a temporary link such as a telephone line or by a network such as the internet. The image and descriptor databases may be provided, for example, in permanent storage or on portable data storage media such as CD-ROMs or DVDs.

Components of the system as described may be provided in software or hardware form. Although the invention has been described in the form of a computer system, it could be implemented in other forms, for example using a dedicated chip.

Specific examples have been given of methods of representing a 2D shape of an object and of methods for calculating values representing similarities between two shapes but any suitable such methods can be used.

The invention can also be used, for example, for matching images of objects for verification purposes, or for filtering.

1. A method of representing an object appearing in an image or a sequence of images, by processing signals corresponding to the image, the method comprising: deriving a plurality of peak coordinate values of a curvature scale space (CSS) representation of the object by smoothing an outline of the object in a plurality of stages starting from an arbitrary point on the outline, and ordering the peak co-ordinate values of the CSS representation on the basis of peak height values of the plurality of peak co-ordinates, the peak height values corresponding to a parameter used for smoothing the outline.
2. A method as claimed in claim 1, wherein said ordering includes generating a representation of the outline that is independent of a starting point on the outline.
3. A method as claimed in claim 1, wherein said ordering includes ordering the peak height values starting from the greatest value.
4. A method as claimed in claim 3 wherein said ordering includes ordering the peak height values in decreasing size.
5. A method as claimed in claim 1, wherein said ordering includes ordering the peak height values starting from the smallest value.
6. A method as claimed in claim 1, further comprising: producing a descriptor from said ordering of the peak height values, and storing the descriptor.
7. A method as claimed in claim 6, wherein said storing includes storing the descriptor in a database.
8. An apparatus arranged to implement a method as claimed in claim 1.
9. A computer system programmed to operate according to a method as claimed in claim 1.
10. A computer-readable storage medium storing computer-executable procedures for implementing a method as claimed in claim 1.
11. A method for representing an object appearing in an image, comprising: identifying at least one object outline; determining a curvature scale space representation for said outline, by smoothing the outline in a plurality of stages, to generate peak coordinates for the curvature scale space representation; and ordering said peak coordinates based on peak height value, corresponding to a parameter used for smoothing the outline, to generate a shape descriptor for said outline.
12. The method of claim 11, further comprising: storing said shape descriptor as a description for said object in a memory.
13. A method for representing an object appearing in an image, comprising: identifying at least one object outline; determining a curvature scale space representation for said outline, by smoothing the outline in a plurality of stages, to generate peak coordinates for the curvature scale space representation; and ordering said peak coordinates, by selecting highest peak value and associated highest peak coordinates and ordering remaining peak coordinates in decreasing peak height, to generate a shape descriptor for said outline wherein said highest peak value and other peak values corresponding to a parameter used for smoothing the outline.
14. The method of claim 13, further comprising: storing said shape descriptor as a description for said object in a memory.
15. A method for representing an object appearing in an image, comprising: identifying at least one object outline; determining a curvature scale space representation for said outline, by smoothing the outline in a plurality of stages, to generate peak coordinates for the curvature scale space representation; and ordering said peak coordinates, by selecting highest peak value and associated peak coordinates and ordering remaining peak coordinates in relation to x-coordinate values by shifting x-coordinates of the remaining peak coordinates in relation to x-coordinate associated with said highest peak value, to generate a shape descriptor for said outline wherein said highest peak value and other peak values corresponding to a parameter used for smoothing the outline.
16. The method of claim 15, further comprising: storing said shape descriptor as a description for said object in a memory.
17. A method for representing an object appearing in an image, comprising: identifying at least one object outline; determining a curvature scale space representation, by smoothing the outline in a plurality of stages, for said outline to generate a plurality of curves representative of said outline; determining peaks and associated peak coordinates for said plurality of curves; and ordering said peak coordinates, by selecting highest peak value and associated peak coordinates and shifting the x-coordinate associated with said highest peak value to a value of zero, and ordering remaining peak coordinates in relation to x-coordinate values by shifting x-coordinates of the remaining peak coordinates in relation to said shifted x-coordinate associated with said highest peak value, to generate a shape descriptor for said outline wherein said highest peak value and other peak values corresponding to a parameter used for smoothing the outline.
18. A method for representing an object appearing in an image, comprising: identifying at least one object outline; determining a curvature scale space representation, by smoothing the outline in a plurality of stages, for said outline to generate peak coordinates for the outline curvature scale space representation, wherein said peak coordinates are determined using a plural stage filter that produces derivative curves representative of said outline by convolving said object outline; and ordering said peak coordinates, by selecting highest peak value and associated highest peak coordinates and ordering remaining peak coordinates in decreasing peak height, to generate a shape descriptor for said outline wherein said highest peak value and other peak values corresponding to a parameter used for smoothing the outline.